Abstract. Frankl's conjecture, formulated in 1979 and still open, states that in every family of sets closed under unions there is an element contained in at least half of the sets. FC-families are families for which it is proved that every union-closed family containing them satisfies Frankl's condition (e.g., in every union-closed family that contains a one-element set {a}, the element a is contained in at least half of the sets, so families of the form {{a}} are the simplest FC-families). FC-families play an important role in attacking Frankl's conjecture, since they enable significant search space pruning. We present a formalization of the computer-assisted approach for proving that a family is an FC-family. The proof-by-computation paradigm is used, and the proof assistant Isabelle/HOL is used both to check the mathematical content and to perform (verified) combinatorial searches on which the proofs rely. FC-families known in the literature are confirmed, and a new FC-family is discovered.

1 Introduction 

Formalized mathematics and interactive theorem provers (sometimes referred to 
as proof assistants) have made great progress in recent years. Many classical 
mathematical theorems have been formally proved and proof assistants have 
been intensively used in hardware and software verification. Among the most successful
proof assistants nowadays are Coq, Isabelle/HOL, and HOL Light.

Several of the most important results in formal theorem proving concern problems whose proofs have substantial computational content. These proofs are usually highly complex (and therefore often require justification by formal means), since they combine classical mathematical statements with complex computing machinery (usually computer implementations of combinatorial algorithms). The corresponding paradigm is sometimes referred to as proof-by-evaluation or proof-by-computation. Probably the most famous examples of this approach are the proofs of the Four-Color Theorem and of Kepler's conjecture.

Georges Gonthier has formalized a proof of the Four-Color Theorem in Coq [5]. The Four-Color Theorem is famous for being the first long-standing mathematical problem, analyzed by many famous mathematicians, that was finally resolved by a computer program (Appel and Haken [2]). This proof broke new ground because it involved using IBM 370 assembly language programs to carry out a gigantic case analysis, which could not be performed by hand. The proof attracted criticism: computer programming is known to be error-prone, and it is difficult to relate a program precisely to the formal statement of a mathematical theorem. Several attempts to simplify the proof were made (e.g., Robertson et al. [T5]): the number of cases was reduced and programs were written in C instead of assembly language. However, all doubts were removed only when Gonthier employed the proof assistant Coq, reducing the whole proof to several basic logical principles.

Another example of a similar kind is the proof of Kepler's conjecture. As described by Nipkow et al. [5]: "In 1998, Thomas Hales announced the first (by now) accepted proof of Kepler's conjecture. It involves 3 distinct large computations. After 4 years of refereeing by a team of 12 referees, the referees declared that they were 99% certain of the correctness of the proof. Dissatisfied with this, Hales started the informal open-to-all collaborative flyspeck project to formalize the whole proof with a theorem prover."

In this work, we apply the proof-by-evaluation paradigm to the problem of verifying FC-families, a special case of Frankl's conjecture. Frankl's conjecture, an elementary and fundamental statement formulated by Peter Frankl in 1979, states that for every family of sets closed under unions, there is an element contained in at least half of the sets (or, dually, in every family of sets closed under intersections, there is an element contained in at most half of the sets). To the best of our knowledge, the problem is still open. The conjecture has been proved for many special cases. In particular, it is known to be true for: (i) families of at most 36 sets [3]; (ii) families of sets such that their union has at most 11 elements [3].

FC-families are families for which it is proved that all union-closed families containing them satisfy Frankl's condition (if Frankl's conjecture were proved, then every family would be an FC-family). For example, it can easily be shown that if a union-closed family contains a one-element set, then it satisfies Frankl's condition. A similar result holds for any two-element set, etc. FC-families are an important building block in attempts to prove Frankl's conjecture, since they justify pruning large portions of the search space.

Related work. Frankl's conjecture has also been formulated and studied as
a question in lattice theory [1211).

FC-families were introduced by Poonen [11] and further studied by Gao and Yu [5], Vaughan [14,15,16], Morris [5], Markovic [7], Bosnjak and Markovic [3], and Zivkovic and Vuckovic [17].

The basic technique used (the characterization of Frankl's condition based on weight functions and shares) was introduced by Poonen [11] and later successfully used by Bosnjak and Markovic [7,3] and by Zivkovic and Vuckovic [17].

The first attempts at using a computer-assisted computational approach to solve special cases of Frankl's conjecture are described by Zivkovic and Vuckovic [17]. The computations are performed by (unverified) Java programs. However, in order to increase the level of trust, the Java programs generate certificates that can be checked by independent tools.

The present paper represents a formalized reformulation of the results of Zivkovic and Vuckovic [17]. All mathematical content is rigorously formalized within Isabelle/HOL and proofs are mechanically checked. The Java programs are reimplemented in the functional language of Isabelle/HOL and their correctness is formally verified. Mathematical and computational content are clearly separated, and the parts of the proofs that rely on computations are clearly isolated. Since the whole formalization is performed and verified within a proof assistant, there is no need for explicit certificates for statements proved by computation.

Our main contribution is a set of rigorous, machine-verifiable proofs that all FC-families previously described in the literature are indeed FC-families. Unlike most pen-and-paper proofs, our proofs follow a uniform approach, supported by an underlying combinatorial search procedure. The second contribution is a new type of FC-families: four three-element sets all contained in a seven-element set.

Background logic and notation. The logic and the notation used in this paper follow Isabelle/HOL. Isabelle/HOL [10] is a development of Higher Order Logic (HOL), and it conforms largely to everyday mathematical notation. The basic types include truth values (bool), natural numbers (nat) and integers (int). Functions can be defined by recursion (either primitive or general). Sets over type 'a, type 'a set, follow the usual mathematical conventions. Sets of sets (i.e., objects of the type 'a set set) are called families. The set of all subsets of a set A is denoted by pow A, and the number of elements of a set A is denoted by |A|. Lists over type 'a, type 'a list, come with the empty list [], the infix prepend constructor #, the infix @ that appends two lists, and the conversion function set from lists to sets. The n-th element of a list l is denoted by l ! n. The list [0, 1, ..., n - 1] is denoted by [0..<n]. The function sort sorts a list, listsum calculates its sum, and remdups removes duplicate elements. Lists with no repeated elements are called distinct. The standard higher-order functions map, filter, and foldl are also supported (for details see [10]).

All definitions and statements given in this paper are formalized within Isabelle/HOL. However, in order to make the text accessible to a more general audience not familiar with Isabelle/HOL, many minor details are omitted and some imprecisions are introduced (for example, we use the standard symbols of the related work, although some of them are ambiguous). Statements are grouped into propositions, lemmas, and theorems. Propositions usually express simple, technical results and are printed here without proofs. All sets and families are considered to be finite, and this assumption (present in the Isabelle/HOL formalization) will not be explicitly stated in the rest of the paper.

4 Corresponding Isabelle/HOL proof documents are available from http://argo.matf.bg.ac.rs

5 In a strict type setting, sets containing elements of mixed types are not allowed.

Outline. The rest of the paper is organized as follows. In Section 2 we give the mathematical background on union-closed families and Frankl's conjecture, and prove the main theoretical results. In Section 3 we formulate the combinatorial search algorithm, prove its correctness and give its efficient implementation. In Section 4 we introduce uniform families and the techniques used for avoiding symmetries when analyzing them. In Section 5 we verify several kinds of uniform FC-families. Finally, in Section 6 we draw conclusions and give directions for further work.

2 Frankl's Families 

2.1 Union Closed Families 

First we give basic definitions of union-closed families, closure under unions, and 
operations used to incrementally obtain closed families. 

Definition 1. Let $F$ and $F_c$ be families.

$F$ is union closed, denoted by $\mathit{uc}\ F$, iff $\forall A \in F.\ \forall B \in F.\ A \cup B \in F$. $F$ is union closed for $F_c$, denoted by $\mathit{uc}_{F_c}\ F$, iff $\mathit{uc}\ F \wedge (\forall A \in F.\ \forall B \in F_c.\ A \cup B \in F)$.

Closure of $F$, denoted by $\langle F \rangle$, is the minimal family of sets (in the sense of inclusion) that contains $F$ and is union closed. Closure of $F$ for $F_c$, denoted by $\langle F \rangle_{F_c}$, is the minimal family of sets (in the sense of inclusion) that contains $F$ and is union closed for $F_c$.

Insert and close operation of set $A$ to family $F$, denoted by $\mathit{ic}\ A\ F$, is the family $F \cup \{A\} \cup \{A \cup B.\ B \in F\}$. Insert and close operation for $F_c$ of set $A$ to family $F$, denoted by $\mathit{ic}_{F_c}\ A\ F$, is the family $F \cup \{A\} \cup \{A \cup B.\ B \in F\} \cup \{A \cup B.\ B \in F_c\}$.

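Proposition 1.

1. $\langle F \rangle = \{\bigcup F'.\ F' \in \mathit{pow}\ F - \{\emptyset\}\}$

2. $\langle F \cup \{A\} \rangle = \mathit{ic}\ A\ \langle F \rangle$, $\langle F \cup \{A\} \rangle_{F_c} = \mathit{ic}_{F_c}\ A\ \langle F \rangle_{F_c}$

3. If $F \subseteq \mathit{pow}\ \bigcup A$ and $\mathit{uc}_A\ F$ then $\mathit{uc}_{\langle A \rangle}\ F$.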
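As an informal illustration (not part of the Isabelle/HOL formalization), the operations of Definition 1 and the first two items of Proposition 1 can be sketched in Python. The function names and the representation of families as Python sets of frozensets are assumptions made only for this sketch.

```python
from itertools import combinations

def uc(F):
    """True iff the family F (a set of frozensets) is union closed."""
    return all(A | B in F for A in F for B in F)

def ic(A, F, Fc=frozenset()):
    """Insert-and-close of the set A into F (for Fc, if it is given)."""
    A = frozenset(A)
    return set(F) | {A} | {A | B for B in F} | {A | B for B in Fc}

def closure(F):
    """Closure of F, built one set at a time as in item 2 of Proposition 1."""
    result = set()
    for A in F:
        result = ic(A, result)
    return result

def closure_by_unions(F):
    """Closure of F as the unions of all non-empty subfamilies (item 1)."""
    F = list(F)
    return {frozenset().union(*sub)
            for r in range(1, len(F) + 1)
            for sub in combinations(F, r)}
```

Under these assumptions, `closure` and `closure_by_unions` should agree on any finite family; the incremental version avoids enumerating all subfamilies.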

2.2 Frankl's Condition

The next definition formalizes Frankl's condition and the notion of an FC-family.

Definition 2. Family of sets $F$ satisfies Frankl's condition, and we say that it is a Frankl's family, denoted by $\mathit{frankl}\ F$, if it contains an element that occurs in at least half of the sets in the family, i.e., $\mathit{frankl}\ F \equiv \exists a.\ a \in \bigcup F \wedge 2 \cdot \#_a F \ge |F|$, where $\#_a F$ denotes $|\{A \in F.\ a \in A\}|$.

Family of sets $F_c$ is an FC-family if it is proved that every union closed family $F$ such that $F \supseteq F_c$ is Frankl's.
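A minimal Python sketch of Frankl's condition from Definition 2 is given below. The names `count` and `frankl` are hypothetical, and families are assumed to be finite collections of (frozen)sets.

```python
def count(a, F):
    """Number of sets of the family F that contain the element a."""
    return sum(1 for A in F if a in A)

def frankl(F):
    """True iff some element of the union of F lies in at least half of the sets."""
    universe = set().union(*F) if F else set()
    return any(2 * count(a, F) >= len(F) for a in universe)
```

The comparison is written as `2 * count >= len(F)` to stay within integers, mirroring the formulation used in the definition.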



2.3 Family Isomorphisms 



The domain of the family does not play any important role for many properties related to Frankl's condition: many properties are invariant under domain changes by injective functions (which establish a kind of isomorphism between two families). Therefore, in many cases it suffices to consider only families over canonical domains, i.e., initial ranges {0, 1, ..., n - 1} of natural numbers.

Proposition 2. Let $F$ be a family of sets and $f$ a function injective on $\bigcup F$. Let $F'$ be the image of $F$ under $f$ (then $f$ is a bijection between $\bigcup F$ and $\bigcup F'$).

1. If $a \in \bigcup F$, then $\#_a F = \#_{f(a)} F'$.

2. $|F| = |F'|$.

3. If $A \in F$ and $A' \in F'$ is the image of $A$ under $f$, then $|A| = |A'|$.

4. $F$ is union closed if and only if $F'$ is.

5. $F$ is Frankl's if and only if $F'$ is.

6. If $F'$ is an FC-family, then so is $F$.



2.4 FC Characterization by Weight Functions and Shares 

We describe the central technique for proving that a family is an FC-family, relying on characterizations of Frankl's condition using weights and shares.

Definition 3. A function $w : X \to \mathbb{N}$ is a weight function on $A \subseteq X$, denoted by $\mathit{wf}_A\ w$, iff $\exists a \in A.\ w(a) > 0$. Weight of a set $A$ wrt. weight function $w$, denoted by $w(A)$, is the value $\sum_{a \in A} w(a)$. Weight of a family $F$ wrt. weight function $w$, denoted by $w(F)$, is the value $\sum_{A \in F} w(A)$.

Lemma 1. $\mathit{frankl}\ F \iff \exists w.\ \mathit{wf}_{(\bigcup F)}\ w \wedge 2 \cdot w(F) \ge w(\bigcup F) \cdot |F|$

Proof. Assume $\mathit{frankl}\ F$ and let $a$ be the element satisfying Frankl's condition. Let $w$ be the weight function assigning 1 to $a$ and 0 to all other elements. Since $w(F) = \#_a F$ and $w(\bigcup F) = 1$, the statement holds.

Conversely, suppose that $\neg \mathit{frankl}\ F$ and let $w$ be an arbitrary weight function on $\bigcup F$. Then, for every $a \in \bigcup F$, $2 \cdot \#_a F < |F|$. Since at least one weight $w(a)$ is positive, it follows that $2 \cdot w(F) = \sum_{a \in \bigcup F} w(a) \cdot 2 \cdot \#_a F < |F| \cdot \sum_{a \in \bigcup F} w(a) = |F| \cdot w(\bigcup F)$.

A concept that will enable a slightly more operative formulation of the previous characterization is the concept of share.

Definition 4. Let $w$ be a weight function. Share of a set $A$ wrt. $w$ and a set $X$, denoted by $\bar{w}_X(A)$, is the value $2 \cdot w(A) - w(X)$. Share of a family $F$ wrt. $w$ and a set $X$, denoted by $\bar{w}_X(F)$, is the value $\sum_{A \in F} \bar{w}_X(A)$.

6 Note that, in order to accommodate computer implementation, only integer weights are allowed, and to avoid rational numbers the share of a set $A$ is defined as $2 \cdot w(A) - w(X)$, instead of $w(A) - w(X)/2$ that is used in the literature.



Example 1. Let $w$ be a function such that $w(a_0) = 1$, $w(a_1) = 2$, and $w(a) = 0$ for all other elements. $w$ is clearly a weight function. Then $w(\{a_0, a_1, a_2\}) = 3$ and $w(\{\{a_0, a_1\}, \{a_1, a_2\}, \{a_1\}\}) = 7$. Also, $\bar{w}_{\{a_0, a_1, a_2\}}(\{a_1, a_2\}) = 2 \cdot w(\{a_1, a_2\}) - w(\{a_0, a_1, a_2\}) = 4 - 3 = 1$, and $\bar{w}_{\{a_0, a_1, a_2\}}(\{\{a_0, a_1\}, \{a_1, a_2\}, \{a_1\}\}) = (2 \cdot 3 - 3) + (2 \cdot 2 - 3) + (2 \cdot 2 - 3) = 5$.

Proposition 3. $\bar{w}_X(F) = 2 \cdot w(F) - w(X) \cdot |F|$

Lemma 2. $\mathit{frankl}\ F \iff \exists w.\ \mathit{wf}_{(\bigcup F)}\ w \wedge \bar{w}_{(\bigcup F)}(F) \ge 0$

Proof. Follows directly from Proposition 3 and Lemma 1.
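The weights and shares of Definitions 3 and 4 can be illustrated by a short Python sketch that recomputes the values of Example 1 and checks Proposition 3 on them. The function names and the dictionary representation of the weight function (missing keys meaning weight 0) are assumptions of this sketch.

```python
def weight_of_set(w, A):
    """Weight of a set: sum of the weights of its elements."""
    return sum(w.get(a, 0) for a in A)

def weight_of_family(w, F):
    """Weight of a family: sum of the weights of its sets."""
    return sum(weight_of_set(w, A) for A in F)

def set_share(w, X, A):
    """Share of the set A wrt. w and X, in the integer form 2*w(A) - w(X)."""
    return 2 * weight_of_set(w, A) - weight_of_set(w, X)

def family_share(w, X, F):
    """Share of a family: sum of the shares of its sets."""
    return sum(set_share(w, X, A) for A in F)

# Data of Example 1
w = {"a0": 1, "a1": 2}
X = frozenset({"a0", "a1", "a2"})
F = [frozenset({"a0", "a1"}), frozenset({"a1", "a2"}), frozenset({"a1"})]

# Proposition 3 on this instance
lhs = family_share(w, X, F)
rhs = 2 * weight_of_family(w, F) - weight_of_set(w, X) * len(F)
```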

Hypercubes. Sets of a family can be grouped into so-called hypercubes.

Definition 5. An $S$-hypercube with a base $K$, denoted by $\mathit{hc}^S_K$, is the family $\{A.\ K \subseteq A \wedge A \subseteq K \cup S\}$. Alternatively, a hypercube can be characterized by $\mathit{hc}^S_K = \{K \cup A.\ A \in \mathit{pow}\ S\}$.

Example 2. Let $S = \{s_0, s_1\}$ and $K = \{k_0, k_1\}$. If $K' \subseteq K$, then all $S$-hypercubes with a base $K'$ are:

$\mathit{hc}^S_{\{\}} = \{\{\}, \{s_0\}, \{s_1\}, \{s_0, s_1\}\}$
$\mathit{hc}^S_{\{k_0\}} = \{\{k_0\}, \{k_0, s_0\}, \{k_0, s_1\}, \{k_0, s_0, s_1\}\}$
$\mathit{hc}^S_{\{k_1\}} = \{\{k_1\}, \{k_1, s_0\}, \{k_1, s_1\}, \{k_1, s_0, s_1\}\}$
$\mathit{hc}^S_{\{k_0, k_1\}} = \{\{k_0, k_1\}, \{k_0, k_1, s_0\}, \{k_0, k_1, s_1\}, \{k_0, k_1, s_0, s_1\}\}$

The previous example indicates that (disjoint) $S$-hypercubes can span the whole $\mathit{pow}\ (K \cup S)$. Indeed, this is generally the case.

Proposition 4. (i) $\mathit{pow}\ (K \cup S) = \bigcup_{K' \subseteq K} \mathit{hc}^S_{K'}$. (ii) If $K_1$ and $K_2$ are different and disjoint with $S$, then $\mathit{hc}^S_{K_1}$ and $\mathit{hc}^S_{K_2}$ are disjoint.

Families of sets can be separated into (disjoint) parts belonging to different hypercubes (formed as $\mathit{hc}^S_K \cap F$).

Definition 6. A hyper-share of a family $F$ wrt. weight function $w$, the hypercube $\mathit{hc}^S_K$ and the set $X$, denoted by $\bar{w}^S_{KX}(F)$, is the value $\sum_{A \in \mathit{hc}^S_K \cap F} \bar{w}_X(A)$.

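Example 3. Let $S$ and $K$ be as in Example 2, let $X = K \cup S$, let $F = \{\{s_0\}, \{s_1\}, \{k_0, s_0\}, \{k_0, k_1, s_0, s_1\}\}$, and $w(a) = 1$ for all $a \in X$. Then, $\bar{w}^S_{\{\}X}(F) = \bar{w}_X(\{s_0\}) + \bar{w}_X(\{s_1\}) = -4$, $\bar{w}^S_{\{k_0\}X}(F) = \bar{w}_X(\{k_0, s_0\}) = 0$, $\bar{w}^S_{\{k_1\}X}(F) = 0$, and $\bar{w}^S_{\{k_0, k_1\}X}(F) = \bar{w}_X(\{k_0, k_1, s_0, s_1\}) = 4$.

Share of a family can be expressed as a sum of hyper-shares.

Proposition 5. If $K \cup S = \bigcup F$ and $K \cap S = \emptyset$, then $\bar{w}_{(\bigcup F)}(F) = \sum_{K' \subseteq K} \bar{w}^S_{K'(\bigcup F)}(F)$.

Lemma 3. Let $w$ be a weight function on $\bigcup F$. If $K \cup S = \bigcup F$, $K \cap S = \emptyset$, and $\forall K' \subseteq K.\ \bar{w}^S_{K'(\bigcup F)}(F) \ge 0$, then $\mathit{frankl}\ F$.

Proof. Immediate consequence of Proposition 5 and Lemma 2.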



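Definition 7. Projection of a family $F$ onto a hypercube $\mathit{hc}^S_K$, denoted by $\mathit{hc}^S_K \lfloor F \rfloor$, is the set $\{A - K.\ A \in \mathit{hc}^S_K \cap F\}$.

Example 4. Let $K$, $S$ and $F$ be as in Example 3. Then $\mathit{hc}^S_{\{\}} \lfloor F \rfloor = \{\{s_0\}, \{s_1\}\}$, $\mathit{hc}^S_{\{k_0\}} \lfloor F \rfloor = \{\{s_0\}\}$, $\mathit{hc}^S_{\{k_1\}} \lfloor F \rfloor = \{\}$, and $\mathit{hc}^S_{\{k_0, k_1\}} \lfloor F \rfloor = \{\{s_0, s_1\}\}$.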
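Hypercubes, hyper-shares and projections can be explored with the following Python sketch, which uses the data of Examples 2 to 4. All function names are hypothetical, and the code is meant only as an executable illustration of the definitions, not as the formalized development.

```python
from itertools import combinations

def powerset(S):
    """All subsets of S, as frozensets."""
    S = list(S)
    return [frozenset(c) for r in range(len(S) + 1) for c in combinations(S, r)]

def hypercube(K, S):
    """The S-hypercube with base K: all sets K | A for A a subset of S."""
    K = frozenset(K)
    return {K | A for A in powerset(S)}

def set_share(w, X, A):
    """Share of the set A wrt. w and X."""
    return 2 * sum(w[a] for a in A) - sum(w[a] for a in X)

def hyper_share(w, K, S, X, F):
    """Sum of the shares of those sets of F that lie in the hypercube."""
    return sum(set_share(w, X, A) for A in hypercube(K, S) & set(F))

def projection(K, S, F):
    """Projection of F onto the hypercube: its members with the base K removed."""
    K = frozenset(K)
    return {A - K for A in hypercube(K, S) & set(F)}

S = {"s0", "s1"}
K = {"k0", "k1"}
X = frozenset(K | S)
w = {a: 1 for a in X}
F = {frozenset({"s0"}), frozenset({"s1"}),
     frozenset({"k0", "s0"}), frozenset({"k0", "k1", "s0", "s1"})}

# Hyper-shares of F, one per base K' contained in K
shares = {Kp: hyper_share(w, Kp, S, X, F) for Kp in powerset(K)}
```

Summing `shares.values()` gives the family share of `F`, which is the statement of Proposition 5 on this instance.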

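Proposition 6.

1. If $K \cap S = \emptyset$ and $K' \subseteq K$, then $\mathit{hc}^S_{K'} \lfloor F \rfloor \subseteq \mathit{pow}\ S$.

2. If $\mathit{uc}\ F$, then $\mathit{uc}\ (\mathit{hc}^S_K \lfloor F \rfloor)$.

3. If $\mathit{uc}\ F$, $F_c \subseteq F$, $S = \bigcup F_c$, and $K \cap S = \emptyset$, then $\mathit{uc}_{F_c}\ (\mathit{hc}^S_K \lfloor F \rfloor)$.

4. If $\forall x \in K.\ w(x) = 0$, then $\bar{w}^S_{KX}(F) = \bar{w}_X(\mathit{hc}^S_K \lfloor F \rfloor)$.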

Union closed extensions. The next definition introduces an important notion for checking FC-families.

Definition 8. Union closed extensions of a family $F_c$ are families whose sets are built from the elements of $\bigcup F_c$ and that are union closed for $F_c$. The family of all union closed extensions is denoted by $\mathit{uce}\ F_c$, and $\mathit{uce}\ F_c \equiv \{F'.\ F' \subseteq \mathit{pow}\ \bigcup F_c \wedge \mathit{uc}_{F_c}\ F'\}$.

Lemma 4. Let $F$ be a non-empty union closed family, and let $F_c$ be a subfamily (i.e., $F_c \subseteq F$). Let $S$ denote $\bigcup F_c$, and let $K$ denote $\bigcup F - \bigcup F_c$. Let $w$ be a weight function on $\bigcup F$ that is zero for all elements of $K$. If the shares of all union closed extensions of $F_c$ are nonnegative, then $F$ is Frankl's, i.e., if $\forall F' \in \mathit{uce}\ F_c.\ \bar{w}_{(\bigcup F_c)}(F') \ge 0$, then $\mathit{frankl}\ F$.

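Proof. Since $K \cup S = \bigcup F$ and $K \cap S = \emptyset$, by Lemma 3 it suffices to show that $\forall K' \subseteq K.\ \bar{w}^S_{K'(\bigcup F)}(F) \ge 0$. Fix $K'$ and assume that $K' \subseteq K$. Since $w$ is zero on $K$, by Proposition 6 it holds that $\bar{w}^S_{K'(\bigcup F)}(F) = \bar{w}_{(\bigcup F)}(\mathit{hc}^S_{K'} \lfloor F \rfloor)$. On the other hand, since $\mathit{uc}\ F$, $F_c \subseteq F$, and $K \cap S = \emptyset$, by Proposition 6 it holds that $\mathit{uc}_{F_c}\ (\mathit{hc}^S_{K'} \lfloor F \rfloor)$. Moreover, $\mathit{hc}^S_{K'} \lfloor F \rfloor \subseteq \mathit{pow}\ S$, so $\mathit{hc}^S_{K'} \lfloor F \rfloor \in \mathit{uce}\ F_c$. Then, $\bar{w}_{(\bigcup F_c)}(\mathit{hc}^S_{K'} \lfloor F \rfloor) \ge 0$ holds from the assumption. However, since $w$ is zero on $K$, it holds that $w(\bigcup F_c) = w(\bigcup F)$ and $\bar{w}_{(\bigcup F)}(\mathit{hc}^S_{K'} \lfloor F \rfloor) = \bar{w}_{(\bigcup F_c)}(\mathit{hc}^S_{K'} \lfloor F \rfloor) \ge 0$.

Theorem 1. A family $F_c$ is an FC-family if there is a weight function $w$ such that the shares (wrt. $w$ and $\bigcup F_c$) of all union closed extensions of $F_c$ are nonnegative.

Proof. Consider a union-closed family $F \supseteq F_c$. Let $w$ be the weight function such that $\forall F' \in \mathit{uce}\ F_c.\ \bar{w}_{(\bigcup F_c)}(F') \ge 0$. Let $w'$ be a function equal to $w$ on $\bigcup F_c$ and 0 on other elements. Since $\forall F' \in \mathit{uce}\ F_c.\ \bar{w}'_{(\bigcup F_c)}(F') = \bar{w}_{(\bigcup F_c)}(F')$, Lemma 4 applies to $F$ and $F$ is Frankl's.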



3 Combinatorial search 



Theorem 1 inspires a procedure for verifying FC-families. It should take a weight function on $\bigcup F_c$ and check that all union closed extensions of $F_c$ have nonnegative shares. We will now define a procedure SomeShareNegative, denoted by $\mathit{ssn}\ F_c\ w$, such that if $\mathit{ssn}\ F_c\ w = \bot$, then for all $F' \in \mathit{uce}\ F_c$ it holds that $\bar{w}_{(\bigcup F_c)}(F') \ge 0$. The heart of this procedure will be a recursive function $\mathit{ssn}^{F_c,w,X}\ L\ F_t$ that performs a systematic traversal of all union closed extensions of $F_c$, but with pruning that speeds up the search. If a union closed extension of $F_c$ has a negative share, it must contain one or more sets with a negative share. Therefore, a list $L$ of all different subsets of $\bigcup F_c$ with negative shares is formed, and each candidate family is determined by the elements of $L$ that it includes. A recursive procedure creates all candidate families by processing the elements of $L$ sequentially, either skipping them (in one recursive branch) or including them into the current candidate family $F_t$ (in the other recursive branch), maintaining the invariant that the current candidate family $F_t$ is always union closed. If the current element of $L$ has already been included in $F_t$ (by earlier closure operations required to maintain the invariant), the search can be pruned. If the (negative) shares of the remaining elements of $L$ cannot outweigh the (nonnegative) share of the current $F_t$, i.e., if $\bar{w}_X(F_t)$ plus the sum of the shares of the remaining elements is nonnegative, then $F_t$ cannot be extended to a family with a negative share (even in the extreme case when all the remaining elements of $L$ are included), so, again, the search can be pruned.
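The search just described can be sketched in Python as follows. This is only an illustration under stated assumptions: the names `some_share_negative`, `closure` and `share` are hypothetical, shares are cached in a dictionary, and the top-level call is assumed to start from the empty family and to close new sets with the closure of `Fc` (as the proof of Theorem 2 suggests).

```python
from itertools import combinations

def powerset(S):
    S = list(S)
    return [frozenset(c) for r in range(len(S) + 1) for c in combinations(S, r)]

def share(w, X, A):
    """Share of the set A wrt. the weights w and the set X."""
    return 2 * sum(w[a] for a in A) - sum(w[a] for a in X)

def closure(F):
    """Union closure of the family F."""
    result = set()
    for A in F:
        result |= {A} | {A | B for B in result}
    return result

def some_share_negative(Fc, w):
    """True iff the search reaches a candidate family with a negative share."""
    Fc = {frozenset(A) for A in Fc}
    X = frozenset().union(*Fc)
    Fc_closed = closure(Fc)
    sh = {A: share(w, X, A) for A in powerset(X)}   # cached set shares
    L = [A for A in sh if sh[A] < 0]                # sets with negative share

    def ssn(i, rest, Ft, st):
        # L[i:] is still to be processed, rest is the sum of its shares,
        # Ft is the current candidate family and st is its share.
        if i == len(L):
            return st < 0
        if st + rest >= 0:          # even taking everything cannot go negative
            return False
        h = L[i]
        if ssn(i + 1, rest - sh[h], Ft, st):        # branch that skips h
            return True
        if h in Ft:                 # h was already added by an earlier closure
            return False
        new = ({h} | {h | B for B in Ft} | {h | B for B in Fc_closed}) - Ft
        return ssn(i + 1, rest - sh[h], Ft | new, st + sum(sh[A] for A in new))

    return ssn(0, sum(sh[A] for A in L), frozenset(), 0)
```

For instance, `some_share_negative([{0}], {0: 1})` explores the extensions of a family consisting of a single one-element set; a `False` result corresponds to the value written as $\bot$ in the text.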

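Definition 9. The function $\mathit{ssn}^{F_c,w,X}\ L\ F_t$ is defined by primitive recursion (over the structure of the list $L$):

$$\begin{array}{l}
\mathit{ssn}^{F_c,w,X}\ []\ F_t \equiv \bar{w}_X(F_t) < 0 \\
\mathit{ssn}^{F_c,w,X}\ (h \mathbin{\#} t)\ F_t \equiv \\
\quad \text{if } \bar{w}_X(F_t) + \sum_{A \in h \# t} \bar{w}_X(A) \ge 0 \text{ then } \bot \\
\quad \text{else if } \mathit{ssn}^{F_c,w,X}\ t\ F_t \text{ then } \top \\
\quad \text{else if } h \in F_t \text{ then } \bot \\
\quad \text{else } \mathit{ssn}^{F_c,w,X}\ t\ (\mathit{ic}_{F_c}\ h\ F_t)
\end{array}$$

Let $L$ be a distinct list such that its set is $\{A.\ A \in \mathit{pow}\ \bigcup F_c \wedge \bar{w}_X(A) < 0\}$, where $X = \bigcup F_c$. Then

$$\mathit{ssn}\ F_c\ w \equiv \mathit{ssn}^{\langle F_c \rangle, w, (\bigcup F_c)}\ L\ \emptyset$$

Next we prove the soundness of the $\mathit{ssn}\ F_c\ w$ function.

Lemma 5. If (i) $\mathit{ssn}^{F_c,w,X}\ L\ F_t = \bot$, (ii) for all elements $A$ in $L$ it holds that $\bar{w}_X(A) < 0$, (iii) for all $A \in F' - F_t$, if $\bar{w}_X(A) < 0$, then $A$ is in $L$, (iv) $F' \supseteq F_t$, and (v) $\mathit{uc}_{F_c}\ F'$, then $\bar{w}_X(F') \ge 0$.

Proof. The proof is by induction. First, note that

$$\bar{w}_X(F') = \sum_{A \in F'} \bar{w}_X(A) = \sum_{A \in F_t} \bar{w}_X(A) + \sum_{A \in F' - F_t} \bar{w}_X(A). \qquad (1)$$

Consider the base case of $L = []$. Since $\mathit{ssn}^{F_c,w,X}\ []\ F_t = \bot$, it holds that $\sum_{A \in F_t} \bar{w}_X(A) = \bar{w}_X(F_t) \ge 0$ and the first term in (1) is nonnegative. If there were some $A \in F' - F_t$ such that $\bar{w}_X(A) < 0$, then, from the assumptions, it would be in $L$, which is impossible since $L$ is empty. Therefore, the second term in (1) is also nonnegative, which completes the proof.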

Consider the inductive step, and assume that $L = h \mathbin{\#} t$.

First consider the case when $\bar{w}_X(F_t) + \sum_{A \in h \# t} \bar{w}_X(A) \ge 0$. Let $P$ denote the set $\{A.\ A \in F' - F_t \wedge \bar{w}_X(A) \ge 0\}$, and let $N$ denote the set $\{A.\ A \in F' - F_t \wedge \bar{w}_X(A) < 0\}$. Since, by assumptions, all elements of $N$ are in $L = h \mathbin{\#} t$, and since, by assumptions, all shares of $h \mathbin{\#} t - N$ are negative, it holds that

$$\sum_{A \in h \# t} \bar{w}_X(A) = \sum_{A \in N} \bar{w}_X(A) + \sum_{A \in h \# t - N} \bar{w}_X(A) \le \sum_{A \in N} \bar{w}_X(A). \qquad (2)$$

It holds that $\sum_{A \in F' - F_t} \bar{w}_X(A) = \sum_{A \in P} \bar{w}_X(A) + \sum_{A \in N} \bar{w}_X(A)$. Therefore, since all shares of $P$ are nonnegative, from (1) and (2) and the assumption of the current case it holds that

$$\bar{w}_X(F') \ge \sum_{A \in F_t} \bar{w}_X(A) + \sum_{A \in N} \bar{w}_X(A) \ge \bar{w}_X(F_t) + \sum_{A \in h \# t} \bar{w}_X(A) \ge 0.$$

Next, consider the case when $\bar{w}_X(F_t) + \sum_{A \in h \# t} \bar{w}_X(A) < 0$. Since, by assumptions, $\mathit{ssn}^{F_c,w,X}\ (h \mathbin{\#} t)\ F_t = \bot$, by the definition of $\mathit{ssn}$ it must hold that $\mathit{ssn}^{F_c,w,X}\ t\ F_t = \bot$.

Consider the case when $h \in F_t$ or $h \notin F'$. Then $h \notin F' - F_t$. The conclusion follows by the induction hypothesis for the recursive call $\mathit{ssn}^{F_c,w,X}\ t\ F_t$, since all assumptions are satisfied. Indeed, all elements of $F' - F_t$ with negative shares must be in $t$, since $h \notin F' - F_t$, and the other assumptions are trivially satisfied.

Finally, consider the case when $h \notin F_t$ and $h \in F'$. The conclusion follows by the induction hypothesis for the recursive call $\mathit{ssn}^{F_c,w,X}\ t\ (\mathit{ic}_{F_c}\ h\ F_t)$, since all assumptions are satisfied for this call. Indeed, in this case $\mathit{ssn}^{F_c,w,X}\ (h \mathbin{\#} t)\ F_t = \mathit{ssn}^{F_c,w,X}\ t\ (\mathit{ic}_{F_c}\ h\ F_t)$ and the left hand side is $\bot$ from the current assumptions. All elements of $F' - \mathit{ic}_{F_c}\ h\ F_t$ with negative shares must be in $t$. Indeed, this holds since $F_t \subseteq \mathit{ic}_{F_c}\ h\ F_t$, and $h \in \mathit{ic}_{F_c}\ h\ F_t$, and since all elements of $F' - F_t$ with negative shares are in $h \mathbin{\#} t$. It holds that $\mathit{ic}_{F_c}\ h\ F_t \subseteq F'$ since $F_t \subseteq F'$, $h \in F'$ and $\mathit{uc}_{F_c}\ F'$. The other assumptions trivially hold.

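Theorem 2. If $\mathit{ssn}\ F_c\ w = \bot$ and $F' \in \mathit{uce}\ F_c$, then $\bar{w}_{(\bigcup F_c)}(F') \ge 0$.

Proof. Fix $F'$ from $\mathit{uce}\ F_c$. Then $F' \subseteq \mathit{pow}\ \bigcup F_c$ and $\mathit{uc}_{F_c}\ F'$. Let $L$ be a distinct list such that its set is $\{A.\ A \in \mathit{pow}\ \bigcup F_c \wedge \bar{w}_{(\bigcup F_c)}(A) < 0\}$. From $\mathit{ssn}\ F_c\ w = \bot$ and the definition of $\mathit{ssn}$ it holds that $\mathit{ssn}^{\langle F_c \rangle, w, (\bigcup F_c)}\ L\ \emptyset = \bot$. All assumptions of Lemma 5 apply. Indeed, for all $A$ in $L$, $\bar{w}_{(\bigcup F_c)}(A) < 0$. For all $A$ in $F' - \emptyset$, if $\bar{w}_{(\bigcup F_c)}(A) < 0$, then, since $F' \subseteq \mathit{pow}\ \bigcup F_c$, $A$ is in $L$. Trivially, $\emptyset \subseteq F'$. Since $\mathit{uc}_{F_c}\ F'$, by Proposition 1 it holds that $\mathit{uc}_{\langle F_c \rangle}\ F'$. Therefore, $\bar{w}_{(\bigcup F_c)}(F') \ge 0$ holds.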



Apart from being sound, the procedure can also be shown to be complete. Namely, it could be shown that if $\mathit{ssn}\ F_c\ w = \top$, then there is an $F' \in \mathit{uce}\ F_c$ such that $\bar{w}_{(\bigcup F_c)}(F') < 0$. This comes from the invariant that the current family $F_t$ in the search is always in $\mathit{uce}\ F_c$, which is maintained by taking the closure $\mathit{ic}_{F_c}\ h\ F_t$ whenever an element $h$ is added. Since this aspect of the procedure is not relevant for the rest of the proofs, it will be neither formally stated nor proved.

3.1 Efficient implementation 

In order to obtain executability and increase efficiency, a series of refinements of $\mathit{ssn}\ F_c\ w$ is done. Each refined version introduces a new implementation feature that makes it more efficient than the previous one, while remaining equivalent to it.

First, an executable function cannot operate directly on abstract families of sets. Without loss of generality, it suffices to consider only families of sets of natural numbers. Sets of natural numbers are represented by natural number codes. A set $A$ is represented by the code $\hat{A} = \sum_{k \in A} 2^k$. Families of sets of natural numbers $F$ are represented by (distinct) lists of natural number codes $\hat{F}$. This representation will be referred to as the list-of-nats representation (e.g., $F = \{\{0, 1\}, \{1, 2\}, \{0, 1, 2\}\}$ is represented by the list-of-nats $\hat{F} = [3, 6, 7]$). Basic set operations have their corresponding list-of-nats counterparts.

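- The union of two sets $\cup$ corresponds to bitwise disjunction (denoted by $\hat{\cup}$). It holds that if $C = A \cup B$, then $\hat{C} = \hat{A} \mathbin{\hat{\cup}} \hat{B}$.

- Adding a set $A$ to a family of sets $F$ (i.e., $\{A\} \cup F$) corresponds to the operation (also denoted by $\hat{\cup}$) that prepends $\hat{A}$ to $\hat{F}$, but only if it is not already present, i.e., by: if $\hat{A} \in \hat{F}$ then $\hat{F}$ else $\hat{A} \mathbin{\#} \hat{F}$. It holds that if $F' = \{A\} \cup F$, then $\hat{F'} = \hat{A} \mathbin{\hat{\cup}} \hat{F}$.

- Union of two families (i.e., $F' \cup F$), also denoted by $\hat{\cup}$, is performed by iteratively adding sets from one family to another, i.e., as $\mathit{foldl}\ (\lambda \hat{A}\ \hat{F}.\ \hat{A} \mathbin{\hat{\cup}} \hat{F})\ \hat{F}\ \hat{F'}$. It holds that if $F'' = F \cup F'$, then $\hat{F''} = \hat{F} \mathbin{\hat{\cup}} \hat{F'}$.

- Adding a set $A$ to all members of a family of sets $F$ (i.e., $\{A \cup B.\ B \in F\}$), denoted by $[\hat{A} \mathbin{\hat{\cup}} \hat{B}.\ \hat{B} \in \hat{F}]$, is performed by $\mathit{map}\ (\lambda \hat{B}.\ \hat{A} \mathbin{\hat{\cup}} \hat{B})\ \hat{F}$. It holds that if $F' = \{A \cup B.\ B \in F\}$, then $\hat{F'} = [\hat{A} \mathbin{\hat{\cup}} \hat{B}.\ \hat{B} \in \hat{F}]$.

- Insert and close for $F_c$ (i.e., $\mathit{ic}_{F_c}\ A\ F$), denoted by $\hat{\mathit{ic}}$, is computed as $([\hat{A}]\ @\ [\hat{A} \mathbin{\hat{\cup}} \hat{B}.\ \hat{B} \in \hat{F}]\ @\ [\hat{A} \mathbin{\hat{\cup}} \hat{B}.\ \hat{B} \in \hat{F_c}]) \mathbin{\hat{\cup}} \hat{F}$. It holds that if $F' = \mathit{ic}_{F_c}\ A\ F$, then $\hat{F'} = \hat{\mathit{ic}}_{\hat{F_c}}\ \hat{A}\ \hat{F}$.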
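The list-of-nats representation and its operations can be mirrored in Python, where integers serve as bit masks. The function names below are hypothetical and the sketch is not the verified Isabelle/HOL code; it only illustrates the encoding described in the list above.

```python
def encode(A):
    """Code of a set of naturals: the sum of 2**k over its elements k."""
    code = 0
    for k in A:
        code |= 1 << k
    return code

def decode(code):
    """Set of naturals represented by a code."""
    return {k for k in range(code.bit_length()) if code >> k & 1}

def add_set(a, fam):
    """Prepend the code a to the list fam unless it is already present."""
    return fam if a in fam else [a] + fam

def union_families(fam, other):
    """Add the codes of other to fam one by one."""
    for a in other:
        fam = add_set(a, fam)
    return fam

def insert_and_close(a, fam, fam_c):
    """Insert-and-close on codes: a, a|b for b in fam, and a|b for b in fam_c."""
    add = [a] + [a | b for b in fam] + [a | b for b in fam_c]
    return union_families(fam, add)
```

With this encoding, set union becomes the bitwise `|` on integers, which is the main reason the representation is convenient for an executable search.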

An important optimization of the basic $\mathit{ssn}\ F_c\ w$ procedure is to avoid repeated computations of shares (both for the elements of the list $L$ and for the current family $F_t$). So, instead of accepting a list of sets $L$ and the current family of sets $F_t$, the function is modified to accept a list of ordered pairs, where the first component is the code of the corresponding element of $L$ and the second component is its share (wrt. $w$ and $X$), and to accept an ordered pair $(\hat{F_t}, s_t)$, where $\hat{F_t}$ is the list-of-nats representation of $F_t$ and $s_t$ is its family share (wrt. $w$ and $X$). The summation of the shares of the elements of $L$ is also unnecessarily repeated. It can be avoided if the sum ($s_l$) is passed through the function.



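$$\begin{array}{l}
\mathit{ssn}^{F_c,w,X}\ ([], 0)\ (\hat{F_t}, s_t) \equiv s_t < 0 \\
\mathit{ssn}^{F_c,w,X}\ ((\hat{h}, s_h) \mathbin{\#} t, s_l)\ (\hat{F_t}, s_t) \equiv \\
\quad \text{if } s_t + s_l \ge 0 \text{ then } \bot \\
\quad \text{else if } \mathit{ssn}^{F_c,w,X}\ (t, s_l - s_h)\ (\hat{F_t}, s_t) \text{ then } \top \\
\quad \text{else if } \hat{h} \in \hat{F_t} \text{ then } \bot \\
\quad \text{else let } \hat{F'_t} = \hat{\mathit{ic}}_{\hat{F_c}}\ \hat{h}\ \hat{F_t};\ s'_t = \bar{w}_X(F'_t) \text{ in} \\
\qquad \mathit{ssn}^{F_c,w,X}\ (t, s_l - s_h)\ (\hat{F'_t}, s'_t)
\end{array}$$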



Another source of inefficiency is the calculation of $\bar{w}_X(F'_t)$. If performed directly from the definition of the family share, the sum would contain the shares of all elements of $F_t$ and of all elements that are added to $F_t$ when adding $h$ and closing for $F_c$. However, it is already known that the sum of the shares of the elements of $F_t$ is $s_t$, and the implementation can benefit from this fact. Also, calculating the shares of the sets that are added to $F_t$ can be made faster. Namely, the share of the same set is calculated over and over again in different parts of the search space. So, it is much better to precompute the shares of all sets from $\mathit{pow}\ X$ and store them in a lookup table that is consulted each time a set share is needed. Note that in this case there is no more need to pass the function $w$ itself, nor the domain $X$, but only the lookup table, denoted by $s_w$.

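$$\begin{array}{l}
\mathit{ssn}^{F_c,s_w}\ ([], 0)\ (\hat{F_t}, s_t) \equiv s_t < 0 \\
\mathit{ssn}^{F_c,s_w}\ ((\hat{h}, s_h) \mathbin{\#} t, s_l)\ (\hat{F_t}, s_t) \equiv \\
\quad \text{if } s_t + s_l \ge 0 \text{ then } \bot \\
\quad \text{else if } \mathit{ssn}^{F_c,s_w}\ (t, s_l - s_h)\ (\hat{F_t}, s_t) \text{ then } \top \\
\quad \text{else if } \hat{h} \in \hat{F_t} \text{ then } \bot \\
\quad \text{else } \mathit{ssn}^{F_c,s_w}\ (t, s_l - s_h)\ (\hat{\mathit{ic}}^{s_w}_{\hat{F_c}}\ \hat{h}\ (\hat{F_t}, s_t)) \\
\hat{\mathit{ic}}^{s_w}_{\hat{F_c}}\ \hat{h}\ (\hat{F_t}, s_t) \equiv \\
\quad \text{let } \mathit{add} = [\hat{h}]\ @\ [\hat{h} \mathbin{\hat{\cup}} \hat{A}.\ \hat{A} \in \hat{F_t}]\ @\ [\hat{h} \mathbin{\hat{\cup}} \hat{A}.\ \hat{A} \in \hat{F_c}]; \\
\qquad \mathit{add} = \mathit{filter}\ (\lambda \hat{A}.\ \hat{A} \notin \hat{F_t})\ (\mathit{remdups}\ \mathit{add}) \text{ in} \\
\quad (\mathit{add}\ @\ \hat{F_t},\ s_t + \mathit{listsum}\ (\mathit{map}\ s_w\ \mathit{add}))
\end{array}$$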

It is shown that this implementation is (in some sense) equivalent to the 
starting, abstract one. This proof is technically involved, but conceptually unin- 
teresting so we omit it in the text. 



4 Uniform nkm-families

Most FC-families that are considered in this paper are uniform, i.e., consist of sets having the same number of elements.

Definition 10. A family of sets $F$ is a uniform nkm-family if it contains $m$ different sets, each containing $k$ elements, and their union has at most $n$ elements. A uniform nkm-family is natural if its union is contained in $\{0, 1, \ldots, n - 1\}$.



Within the Isabelle/HOL implementation, natural nkm-families will be represented by nkm-lists: (lexicographically) sorted, distinct lists of length $m$ containing sorted, distinct lists of length $k$ with all elements contained in $\{0, 1, \ldots, n - 1\}$. To simplify presentation, we will identify natural nkm-families with their corresponding nkm-lists. Assuming that the Isabelle/HOL function $\mathit{comb}\ l\ k$ generates all sorted $k$-element sublists of a sorted list $l$, all nkm-lists for given $n$, $k$ and $m$ can be generated by $\mathit{fams}^{nkm} \equiv \mathit{comb}\ (\mathit{comb}\ [0..<n]\ k)\ m$.
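The generation of nkm-lists can be imitated in Python, with `itertools.combinations` standing in for the role of the function comb. The name `fams` and the use of tuples instead of lists are assumptions made only for this sketch.

```python
from itertools import combinations

def fams(n, k, m):
    """All natural nkm-families as sorted tuples of sorted k-element tuples."""
    # combinations yields results in lexicographic order of the input,
    # so both levels come out sorted and without repetitions.
    k_sets = list(combinations(range(n), k))
    return list(combinations(k_sets, m))
```

The number of generated families is a binomial coefficient of a binomial coefficient, so even for small parameters the list grows quickly; this is what motivates the symmetry reduction discussed next.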

Symmetries. Often one uniform nkm-family can be obtained from another by permuting its elements (e.g., $\{\{a_0, a_1, a_2\}, \{a_1, a_3, a_4\}, \{a_2, a_3, a_4\}\}$ can be obtained from $\{\{a_0, a_1, a_2\}, \{a_0, a_1, a_3\}, \{a_2, a_3, a_4\}\}$ by the permutation $(a_0, a_1, a_2, a_3, a_4) \mapsto (a_3, a_4, a_1, a_2, a_0)$). Applying permutations to sets and families can be implemented in Isabelle/HOL by the functions $\mathit{perm\_set}\ A\ p \equiv \mathit{sort}\ (\mathit{map}\ (\lambda x.\ p\ !\ x)\ A)$ and $\mathit{perm\_fam}\ F\ p \equiv \mathit{sort}\ (\mathit{map}\ (\lambda A.\ \mathit{perm\_set}\ A\ p)\ F)$. Permutations establish bijections between natural uniform families:

Proposition 7. If $p$ is a permutation of $[0, 1, \ldots, n - 1]$ and $F$ is a natural uniform family, then $\mathit{perm\_fam}\ F\ p$ is also a natural uniform family and there is a bijection between $F$ and $\mathit{perm\_fam}\ F\ p$.

Since, by Proposition 2, FC-families are preserved under bijections (isomorphisms), to check if all elements of a given list $\mathcal{F}$ of nkm-families are FC-families, many elements need not be considered. Indeed, it suffices to consider only a list (denoted by $\mathit{nef}\ P\ \mathcal{F}$) of its non-equivalent representatives (under a given list of permutations $P$). The computation of such representatives can start from the given list $\mathcal{F}$, choose an arbitrary member as a representative, remove it and all its permuted variants from the list, and repeat this sieving process until the list becomes empty. An Isabelle/HOL implementation of this procedure can be given by:

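$$\begin{array}{l}
\mathit{nef\_aux}\ P\ \mathcal{F}\ \mathcal{F}_r \equiv \text{case } \mathcal{F} \text{ of } [] \Rightarrow \mathcal{F}_r \\
\quad \mid F \mathbin{\#} \_ \Rightarrow \text{let } \mathcal{F}_P = \mathit{remdups}\ (\mathit{map}\ (\lambda p.\ \mathit{perm\_fam}\ F\ p)\ P) \text{ in} \\
\qquad \mathit{nef\_aux}\ P\ (\mathit{filter}\ (\lambda F'.\ F' \notin \mathcal{F}_P)\ \mathcal{F})\ (F \mathbin{\#} \mathcal{F}_r) \\
\mathit{nef}\ P\ \mathcal{F} \equiv \mathit{nef\_aux}\ P\ \mathcal{F}\ []
\end{array}$$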
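The sieving procedure can be sketched in Python as below. The sketch is iterative rather than recursive, appends representatives instead of prepending them, and always removes the chosen representative itself (so that it terminates even if the identity is not among the permutations); these choices, and the tuple representation, are assumptions of the sketch rather than properties of the Isabelle/HOL definition.

```python
from itertools import combinations, permutations

def perm_set(A, p):
    """Image of the sorted tuple A under the permutation p (p[x] is the image of x)."""
    return tuple(sorted(p[x] for x in A))

def perm_fam(F, p):
    """Image of a family (a sorted tuple of sorted tuples) under p."""
    return tuple(sorted(perm_set(A, p) for A in F))

def nef(P, families):
    """Non-equivalent representatives of families under the permutations P."""
    remaining = list(families)
    reps = []
    while remaining:
        F = remaining[0]
        variants = {perm_fam(F, p) for p in P} | {F}
        remaining = [G for G in remaining if G not in variants]
        reps.append(F)
    return reps

# Usage: families of two 2-element subsets of {0, 1, 2, 3}
P = list(permutations(range(4)))
families = list(combinations(list(combinations(range(4), 2)), 2))
reps = nef(P, families)
```

Each family in `families` is then a permuted variant of some element of `reps`, which is the property stated by the lemma that follows.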

The following lemma proves the correctness of this implementation.

Lemma 6. If $P$ is a list of permutations of $[0, 1, \ldots, n - 1]$ and if $\mathcal{F}$ is a list of natural nkm-families, then for each element $F \in \mathcal{F}$ there is an $F' \in \mathit{nef}\ P\ \mathcal{F}$ such that there is a bijection between $F$ and $F'$.

Proof. First, note that the function $\mathit{nef\_aux}\ P\ \mathcal{F}\ \mathcal{F}_r$ is monotone, i.e., $\mathcal{F}_r \subseteq \mathit{nef\_aux}\ P\ \mathcal{F}\ \mathcal{F}_r$.

By induction, we show that if the assumptions hold for $\mathcal{F}$ and $P$, then for each element $F \in \mathcal{F}$ there is an element $F' \in \mathit{nef\_aux}\ P\ \mathcal{F}\ \mathcal{F}_r$ such that there is a bijection between $F$ and $F'$.



In the base case, when T is empty, the statement trivially holds. 

Assume that T = F # IF' . Let Fp denote all different families obtained by 
permuting F by all elements of P (i.e., J-p = remdups (map (Ap. perm_fam F p) P)) 
and let T~ denote what remains of J 7 when those are removed (i.e., T~ = 
filter (A F. F T. It holds that nef_aux p T T r = nef_aux p T~ (F # J 7 .). 

Let F' be an arbitrary element from T . Since T = F # J 7 ', either F' = F or 
F' € P. 

Assume that F' = F. By monotonicity it holds that F 6 nef_aux p T J- r , 
so F is an element from nef_aux p T T r such that there is a bijection (identity 
function) between F' and it. 

Assume that F' £ T' . 

Consider the case when F' ∈ ℱ_P. Then there is p ∈ P such that F' = 
perm_fam F p. Since F' ∈ ℱ is natural and p ∈ P is a permutation of 
[0, 1, ..., n − 1], by Proposition 7 there is a bijection between F and F'. Since, by 
monotonicity, it holds that F ∈ nef_aux P ℱ ℱ_r, F is an element in nef_aux P ℱ ℱ_r such 
that there is a bijection between F' and it. 

Consider the case when F' ∉ ℱ_P. Then F' ∈ ℱ⁻. By the inductive hypothesis 
for the call nef_aux P ℱ⁻ (F # ℱ_r), there is an element F'' in 
nef_aux P ℱ⁻ (F # ℱ_r) such that there is a bijection between F' and it. Since 
nef_aux P ℱ⁻ (F # ℱ_r) = nef_aux P ℱ ℱ_r, the statement holds. 
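
The conclusion of Lemma 6 can also be checked by brute force on a small instance. The Python sketch below is an illustration under stated assumptions, not the verified Isabelle/HOL code: families are frozensets of frozensets, permutations are tuples, the helper covered is hypothetical, and a bijection between families is witnessed by a permutation p with perm_fam r p = f.

```python
from itertools import combinations, permutations

def perm_fam(family, p):
    """Apply permutation p (a tuple: i -> p[i]) to every element of every set."""
    return frozenset(frozenset(p[x] for x in s) for s in family)

def nef(perms, families):
    """Sieve out one representative per class (perms must contain the identity)."""
    remaining, reps = list(families), []
    while remaining:
        f = remaining[0]
        variants = {perm_fam(f, p) for p in perms}
        remaining = [g for g in remaining if g not in variants]
        reps = [f] + reps
    return reps

def covered(perms, families):
    """True if every family is a permuted variant of some representative."""
    reps = nef(perms, families)
    return all(any(perm_fam(r, p) == f for r in reps for p in perms)
               for f in families)

# Families of two distinct 2-element subsets of {0, 1, 2, 3}.
pairs = [frozenset(c) for c in combinations(range(4), 2)]
families = [frozenset(c) for c in combinations(pairs, 2)]
perms = list(permutations(range(4)))

lemma6_holds = covered(perms, families)
num_reps = len(nef(perms, families))
print(lemma6_holds, num_reps)
```

Here the 15 families fall into two classes (the two sets either share an element or are disjoint), and each family is matched by one of the two representatives.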

Finally, the following lemma shows that only non-equivalent representatives 
need to be considered when checking FC-families. 

Lemma 7. Let ℱ ⊆ fams_nkm and P ⊆ perm [0, 1, ..., n − 1]. If all families 
represented by elements of nef P ℱ are FC-families, then all families represented 
by elements of fams_nkm are FC-families. 

Proof. Let F ∈ fams_nkm. By Lemma 6 there is an F' ∈ nef P ℱ and a bijection 
between F and F'. So, F' is an FC-family, and by Proposition 2 so is F. 



5 FC-families verified 

Having established all the necessary mathematics, in this section we prove that 
certain uniform families are FC-families (mainly by performing verified 
calculations). First, we calculate non-equivalent representatives for fams_533, 
fams_634, and fams_734. 

Lemma 8. The first column of Table 1 contains (respectively) all elements of: 

nef (perm [0..<5]) fams_533, 

nef (perm [0..<6]) (filter (λ F. ¬ check533 F) fams_634), 

nef (perm [0..<7]) (filter (λ F. ¬ check533 F ∧ ¬ check634 F) fams_734), 

where perm l is the function that generates all permutations of a list l, check533 
is a function that checks if any 3 of the 4 given 3-element sets have their 
union contained in a 5-element set, and check634 is a function that checks if the 
union of 4 given 3-element sets is contained in a 6-element set.⁷ 

⁷ Formal definitions of these functions are not given here; they are available in the 
Isabelle/HOL proof documents, along with correctness arguments. 
Table 1. Families and weights 



Proof. By calculations performed by a computer. 
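
The first two computations of Lemma 8 are small enough to be replayed informally. The Python sketch below is an unverified illustration, not the Isabelle/HOL code: it reads fams_nkm as all families of m distinct k-element subsets of {0, ..., n − 1}, reads check533 existentially (some 3 members have a union of at most 5 elements), and represents families as frozensets of frozensets. These readings, and the function bodies, are assumptions; the formal definitions are only available in the proof documents. The fams_734 case is omitted here because it involves 5040 permutations.

```python
from itertools import combinations, permutations

def perm_fam(family, p):
    """Apply permutation p (a tuple: i -> p[i]) to every element of every set."""
    return frozenset(frozenset(p[x] for x in s) for s in family)

def nef(perms, families):
    """Sieve out one representative per class (perms must contain the identity)."""
    remaining, reps = list(families), []
    while remaining:
        f = remaining[0]
        variants = {perm_fam(f, p) for p in perms}
        remaining = [g for g in remaining if g not in variants]
        reps = [f] + reps
    return reps

def fams(n, k, m):
    """All families of m distinct k-element subsets of {0, ..., n-1} (assumed reading)."""
    sets = [frozenset(c) for c in combinations(range(n), k)]
    return [frozenset(c) for c in combinations(sets, m)]

def check533(family):
    """True if some 3 members have a union of at most 5 elements (assumed reading)."""
    return any(len(frozenset().union(*triple)) <= 5
               for triple in combinations(family, 3))

def check634(family):
    """True if the union of all members has at most 6 elements (assumed reading)."""
    return len(frozenset().union(*family)) <= 6

reps_533 = nef(list(permutations(range(5))), fams(5, 3, 3))
reps_634 = nef(list(permutations(range(6))),
               [f for f in fams(6, 3, 4) if not check533(f)])
print(len(reps_533), len(reps_634))
```

Under these assumptions the sketch yields four representatives for fams_533 and two for the filtered fams_634, which agrees with the row counts cited in the proof of Theorem 3.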

Next, we show that all these representatives have non-negative shares. 

Lemma 9. For all F_c and w given in Table 1 it holds that ssn F_c w = ⊥. 

Proof. By calculations performed by a computer. 

Finally, the main result can be easily proved. 

Theorem 3. The following are FC-families: 

1. all families containing one 1-element set (i.e., {{a}}); 

2. all families containing one 2-element set (i.e., {{a, b}}, for a ≠ b); 

3. all families containing 3 3-element sets whose union is contained in a 
5-element set (i.e., uniform 533-families); 

4. all families containing 4 3-element sets whose union is contained in a 
6-element set (i.e., uniform 634-families); 

5. all families containing 4 3-element sets whose union is contained in a 
7-element set (i.e., uniform 734-families). 

Proof. Case 1 holds trivially (since for each family member A that does not 
contain a, there is a member A ∪ {a} that contains a). 
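
The argument for case 1 can be sanity-checked by exhaustive enumeration over a tiny universe. The Python sketch below is only an illustration (it checks the universe {0, 1, 2}, which proves nothing in general); the helper names union_closed and frankl_for are hypothetical.

```python
from itertools import combinations

universe = range(3)
# All 8 subsets of {0, 1, 2}.
subsets = [frozenset(c) for r in range(4) for c in combinations(universe, r)]

def union_closed(family):
    """True if the union of any two members is again a member."""
    return all((a | b) in family for a in family for b in family)

def frankl_for(family, a):
    """True if element a lies in at least half of the members."""
    return 2 * sum(1 for s in family if a in s) >= len(family)

a = 0
checked = 0
all_ok = True
for mask in range(1, 2 ** len(subsets)):
    family = {s for i, s in enumerate(subsets) if (mask >> i) & 1}
    if frozenset({a}) in family and union_closed(family):
        checked += 1
        all_ok = all_ok and frankl_for(family, a)
print(all_ok)
```

Every union-closed family over {0, 1, 2} that contains {0} passes the check, as the mapping A ↦ A ∪ {a} predicts.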

Other proofs are based on the techniques described in this paper. By 
Proposition 5 it suffices to consider only families F such that ⋃F ⊆ {0, 1, ..., n − 1}. 
All families corresponding to rows in Table 1 are FC-families. Indeed, for each 
F_c and w given in a table row, by Lemma 9 it holds that ssn F_c w = ⊥. Therefore, by 
Lemma [H] for all F' ∈ uce F_c it holds that w_(F_c)(F') ≥ 0. Then, F_c is an FC-family 
by Theorem 1. 

In case 2, this completes the proof. 



In case 3, the statement holds by Lemma 7 since, by Lemma 5, four rows 
given in Table 1 correspond to four non-equivalent families. 

To show case 4, let F_c be any family containing 4 3-element sets whose 
union is contained in {0, 1, ..., 5} and let F be a union-closed family such that 
F ⊇ F_c. If check533 F_c holds (i.e., if the union of any 3 members of F_c is contained 
in a 5-element set), then F is Frankl's by case 3. If ¬check533 F_c holds, then 
F_c is in filter (λ F. ¬check533 F) fams_634. The statement then holds by Lemma 7 
since, by Lemma 8, two rows given in Table 1 correspond to two non-equivalent 
families of filter (λ F. ¬check533 F) fams_634. 

Case 5 is proved similarly, using the proofs for both case 3 and 
case 4. 

6 Conclusions and further work 

In this paper, we have formalized (within Isabelle/HOL) a computer-assisted 
approach of Živković and Vučković for verifying FC-families. Well-known 
FC-families are confirmed and a new uniform FC-family is discovered. 

The Isabelle/HOL formalization has around 260KB of data organized into 
around 6500 lines of Isabelle/Isar proof text. The ratio between the size of the 
formalization and the size of the corresponding pen-and-paper proof (the de Bruijn 
factor) is estimated at around 5.5. The total time required to do the formalization is 
very roughly estimated at around 200 man-hours (25 full working days spread 
over a period of around 8 months). 

The total proof checking time of Isabelle/HOL is around 28 minutes on a 
notebook PC with a 2.1GHz Intel/Pentium CPU and 4GB RAM. The major fraction 
of this time (around 23 minutes) is spent in the combinatorial search. 
Checking Lemma 9 consumes most of this time, and its last 8 cases (related to the 
uniform 734-families) alone take 22.8 minutes. This is quite long compared to the 
original Java programs (which perform the whole combinatorial search in around 
1 minute), but still bearable. The big difference is due to the use of machine 
integers supporting atomic bitwise-or in Java and the use of big integers that do 
not support atomic bitwise-or in Isabelle/ML. The search time could be reduced 
if machine integers were also used in Isabelle/ML. In a simple approach, the 
code generator could be instructed to replace mathematical integers in the 
formalization by machine integers in the code, but that would open a gap between 
the formalization and the generated code and would require trusting that no 
overflows occur. A better approach would require formalizing machine integers 
and their properties and using them within the formalization itself. 
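
To make the role of bitwise-or concrete, the Python sketch below encodes a set of small naturals as an integer bitmask, so that set union becomes a single bitwise-or. It illustrates the encoding only: the function names are hypothetical, this is not the code of the original Java programs or of the formalization, and Python integers are themselves arbitrary-precision, so no performance conclusion should be drawn from it.

```python
def encode(s):
    """Encode a set of small naturals as an integer bitmask (bit x set iff x in s)."""
    mask = 0
    for x in s:
        mask |= 1 << x
    return mask

def union_closure(masks):
    """Close a collection of bitmask-encoded sets under union (bitwise OR)."""
    closed = set(masks)
    frontier = list(closed)
    while frontier:
        a = frontier.pop()
        for b in list(closed):
            c = a | b                  # union of the two encoded sets
            if c not in closed:
                closed.add(c)
                frontier.append(c)
    return closed

closure = union_closure([encode({0, 1}), encode({1, 2}), encode({3})])
print(sorted(closure))
```

With a fixed-width machine word the same encoding would need the universe to fit into the word size, which is exactly the overflow concern raised above.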

Compared to the prior pen-and-paper work, the computer-assisted approach 
significantly reduces the complexity of mathematical arguments behind the proof 
and employs computing machinery for what it does best: quickly enumerating and 
checking a large search space. This enables formulation of a general framework 
for checking various FC-families, without the need to employ human 
intellectual resources in analyzing the specifics of separate families. Compared to the 
work of Živković and Vučković, apart from achieving the highest level of trust 
possible, the significant contribution of the formalization is the clear separation 
of mathematical background and combinatorial search algorithms, not present 
in earlier work. Also, separation of abstract properties of search algorithms and 
technical details of their implementation significantly simplifies reasoning about 
their correctness and brings them much closer to a classic mathematical audience, 
not inclined towards computer science. 

This work represents a significant part of formally proving Frankl's 
conjecture for families F such that |⋃F| ≤ 11 and |⋃F| ≤ 12 (already informally 
done by Živković and Vučković [17]), which is the focus of our current and future 
work. We also plan to investigate other FC-families (not necessarily uniform). 